
    Structure- and context-based analysis of the GxGYxYP family reveals a new putative class of glycoside hydrolase.

    Background: Gut microbiome metagenomics has revealed many protein families and domains found largely or exclusively in that environment. Proteins containing the GxGYxYP domain are over-represented in the gut microbiota and are found in Polysaccharide Utilization Loci in the gut symbiont Bacteroides thetaiotaomicron, suggesting their involvement in polysaccharide metabolism, but little else is known of the function of this domain.
    Results: Genomic context and domain architecture analyses support a role for the GxGYxYP domain in carbohydrate metabolism. Sparse occurrences in eukaryotes are the result of lateral gene transfer. The structure of the GxGYxYP domain-containing protein encoded by the BT2193 locus reveals two structural domains: the first composed of three divergent repeats with no recognisable homology to previously solved structures, the second a more familiar seven-stranded β/α barrel. Structure-based analyses, including conservation mapping, localise a presumed functional site to a cleft between the two domains of BT2193. Matching to a catalytic site template from a GH9 cellulase and other analyses point to a putative catalytic triad composed of Glu272, Asp331 and Asp333.
    Conclusions: We suggest that GxGYxYP-containing proteins constitute a novel glycoside hydrolase family of as yet unknown specificity.

    The Yeast YPD1/SLN1 Complex: Insights into Molecular Recognition in Two-Component Signaling Systems

    Abstract: In Saccharomyces cerevisiae, a branched multistep phosphorelay signaling pathway regulates cellular adaptation to hyperosmotic stress. YPD1 functions as a histidine-phosphorylated protein intermediate required for phosphoryl group transfer from a membrane-bound sensor histidine kinase (SLN1) to two distinct response regulator proteins (SSK1 and SKN7). These four proteins are evolutionarily related to the well-characterized “two-component” regulatory proteins from bacteria. Although structural information is available for many two-component signaling proteins, there are very few examples of complexes between interacting phosphorelay partners. Here we report the first crystal structure of a prototypical monomeric histidine-containing phosphotransfer (HPt) protein, YPD1, in complex with its upstream phosphodonor, the response regulator domain associated with SLN1.

    Between Attention and Portfolio Adjustment: Insights from Machine Learning-based Risk Preference Assessment

    Financial firms recommend products to customers, intending to gain their attention and prompt portfolio changes. Based on behavioral decision-making theory, we argue that attention affects portfolio adjustment through the risk deviation between portfolio risk and the customer’s risk preference. Thus, to fully understand the adjustment process, it is necessary to assess customers’ risk preferences. In this study, we use machine learning methods to measure customers’ risk preferences. We then build a dynamic adjustment model and find that attention’s impact on portfolio adjustment speed is stronger when a customer’s risk preference is higher than the portfolio risk (which calls for an upward adjustment) and when the risk preference lies within the customer’s historical portfolio risk experience. We also conducted a field experiment and found that directing customers’ attention to products that address the risk deviation leads to more portfolio adjustment activities. Our study illustrates the role of machine learning in enhancing our understanding of financial decision-making.

    Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

    Stable diffusion, a generative model used in text-to-image synthesis, frequently encounters resolution-induced composition problems when generating images of varying sizes. This issue primarily stems from the model being trained on pairs of single-scale images and their corresponding text descriptions. Moreover, direct training on images of unlimited sizes is infeasible, as it would require an immense number of text-image pairs and entail substantial computational expenses. To overcome these challenges, we propose a two-stage pipeline named Any-Size-Diffusion (ASD), designed to efficiently generate well-composed images of any size, while minimizing the need for high-memory GPU resources. Specifically, the initial stage, dubbed Any Ratio Adaptability Diffusion (ARAD), leverages a selected set of images with a restricted range of ratios to optimize the text-conditional diffusion model, thereby improving its ability to adjust composition to accommodate diverse image sizes. To support the creation of images at any desired size, we further introduce a technique called Fast Seamless Tiled Diffusion (FSTD) at the subsequent stage. This method allows for the rapid enlargement of the ASD output to any high-resolution size, avoiding seaming artifacts and memory overloads. Experimental results on the LAION-COCO and MM-CelebA-HQ benchmarks demonstrate that ASD can produce well-structured images of arbitrary sizes, cutting inference time by 2x compared to the traditional tiled algorithm.
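    The seam-avoidance idea behind tiled enlargement can be illustrated with a generic overlapping-tile blend. This is only a minimal sketch, not the paper's FSTD: `process_tiled`, the tile and overlap sizes, and the per-tile function `fn` are illustrative stand-ins for a per-tile diffusion step.

```python
import numpy as np

def process_tiled(image, fn, tile=64, overlap=16):
    """Apply `fn` to overlapping tiles of a 2-D array and blend the results
    with a feathered weight window so tile borders do not produce seams."""
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)
    step = tile - overlap
    # Feathering window: weight ramps down toward tile edges, so overlapping
    # contributions from neighbouring tiles blend smoothly.
    ramp = np.minimum(np.arange(tile) + 1, np.arange(tile)[::-1] + 1)
    ramp = np.minimum(ramp, overlap).astype(np.float64)
    win = np.outer(ramp, ramp)
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            ys = slice(y, min(y + tile, h))
            xs = slice(x, min(x + tile, w))
            patch = fn(image[ys, xs])  # stand-in for a per-tile diffusion step
            wy = win[: ys.stop - y, : xs.stop - x]
            out[ys, xs] += patch * wy
            weight[ys, xs] += wy
    return out / weight
```

    With the identity function as `fn`, the blend reconstructs the input exactly, which is a quick sanity check that the feathered weights sum consistently across overlaps.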

    Image Fusion Based on Nonsubsampled Contourlet Transform and Saliency-Motivated Pulse Coupled Neural Networks

    In the nonsubsampled contourlet transform (NSCT) domain, a novel image fusion algorithm based on the visual attention model and pulse coupled neural networks (PCNNs) is proposed. For the fusion of high-pass subbands in the NSCT domain, a saliency-motivated PCNN model is proposed. The main idea is that high-pass subband coefficients are combined with their visual saliency maps as input to motivate the PCNN. Coefficients with large firing times are employed as the fused high-pass subband coefficients. Low-pass subband coefficients are merged using a weighted fusion rule based on the firing times of the PCNN. The fused image contains abundant detailed contents from the source images and effectively preserves the saliency structure while enhancing the image contrast. The algorithm preserves the completeness and the sharpness of object regions. The fused image is more natural and can satisfy the requirements of the human visual system (HVS). Experiments demonstrate that the proposed algorithm yields better performance.
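    The firing-time selection rule for high-pass coefficients can be sketched with a heavily simplified single-layer PCNN. The iteration count, decay, and threshold constants below are illustrative assumptions, not the paper's parameters, and the linking field of a full PCNN is omitted.

```python
import numpy as np

def pcnn_firing_times(stimulus, iters=30, decay=0.2, vt=0.5):
    """Simplified PCNN: each 'neuron' fires when its input exceeds a dynamic
    threshold; the threshold decays over time and jumps after each firing.
    Returns the per-position firing counts (the 'firing times')."""
    theta = np.ones_like(stimulus, dtype=np.float64)   # dynamic threshold
    fires = np.zeros_like(stimulus, dtype=np.float64)
    for _ in range(iters):
        y = (stimulus > theta).astype(np.float64)      # firing map this step
        fires += y
        theta = np.exp(-decay) * theta + vt * y        # decay, then jump where fired
    return fires

def fuse_highpass(c1, s1, c2, s2):
    """Motivate the PCNN with saliency-weighted coefficient magnitudes and
    keep, per position, the coefficient that fired more often."""
    f1 = pcnn_firing_times(np.abs(c1) * s1)
    f2 = pcnn_firing_times(np.abs(c2) * s2)
    return np.where(f1 >= f2, c1, c2)
```

    Stronger (more salient) stimuli cross the decaying threshold earlier and more often, so the `np.where` selection favours the source image with the more salient detail at each position.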

    Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila.

    Background: Every genome contains a large number of uncharacterized proteins that may encode entirely novel biological systems. Many of these uncharacterized proteins fall into related sequence families. By applying sequence and structural analysis we hope to provide insight into novel biology.
    Results: We analyze a previously uncharacterized Pfam protein family called DUF4424 [Pfam:PF14415]. The recently solved three-dimensional structure of the protein lpg2210 from Legionella pneumophila provides the first structural information pertaining to this family. This protein additionally includes the first representative structure of another Pfam family called the YARHG domain [Pfam:PF13308]. The Pfam family DUF4424 adopts a 19-stranded beta-sandwich fold that shows similarity to the N-terminal domain of leukotriene A-4 hydrolase. The YARHG domain forms an all-helical domain at the C-terminus. Structure analysis allows us to recognize distant similarities between the DUF4424 domain and individual domains of M1 aminopeptidases and tricorn proteases, which form massive proteasome-like capsids in both archaea and bacteria.
    Conclusions: Based on our analyses we hypothesize that the DUF4424 domain may have a role in forming large, multi-component enzyme complexes. We suggest that the YARHG domain may play a role in binding a moiety in proximity with peptidoglycan, such as a hydrophobic outer membrane lipid or lipopolysaccharide.

    TRANSOM: An Efficient Fault-Tolerant System for Training LLMs

    Large language models (LLMs) with hundreds of billions or trillions of parameters, exemplified by ChatGPT, have had a profound impact on various fields. However, training LLMs at super-large parameter scales requires large high-performance GPU clusters and training periods lasting for months. Due to the inevitable hardware and software failures in large-scale clusters, maintaining uninterrupted, long-duration training is extremely challenging. As a result, a substantial amount of training time is devoted to checkpoint saving and loading, task rescheduling and restarts, and manual anomaly checks, which greatly harms overall training efficiency. To address these issues, we propose TRANSOM, a novel fault-tolerant LLM training system. In this work, we design three key subsystems: the training pipeline automatic fault tolerance and recovery mechanism named Transom Operator and Launcher (TOL), the training task multi-dimensional metric automatic anomaly detection system named Transom Eagle Eye (TEE), and the training checkpoint asynchronous access automatic fault tolerance and recovery technology named Transom Checkpoint Engine (TCE). Here, TOL manages the lifecycle of training tasks, while TEE is responsible for task monitoring and anomaly reporting. TEE detects training anomalies and reports them to TOL, which automatically applies the fault-tolerance strategy to eliminate abnormal nodes and restart the training task. The asynchronous checkpoint saving and loading functionality provided by TCE greatly shortens the fault-tolerance overhead. Experimental results indicate that TRANSOM significantly enhances the efficiency of large-scale LLM training on clusters. Specifically, the pre-training time for GPT3-175B has been reduced by 28%, while checkpoint saving and loading performance have improved by a factor of 20.